feat: add partitioned namespace #5896

Open
wojiaodoubao wants to merge 2 commits into lance-format:main from wojiaodoubao:partitioned-namespace-new

@wojiaodoubao
Contributor

This is a sub-task of the partitioned namespace

@github-actions bot added the enhancement, python, and java labels Feb 5, 2026
@wojiaodoubao wojiaodoubao mentioned this pull request Feb 5, 2026
4 tasks
@wojiaodoubao wojiaodoubao force-pushed the partitioned-namespace-new branch from 23d594c to b61565e Compare February 5, 2026 13:44
@wojiaodoubao wojiaodoubao marked this pull request as draft February 5, 2026 14:49
@wojiaodoubao wojiaodoubao force-pushed the partitioned-namespace-new branch from b61565e to dc95f18 Compare February 5, 2026 15:45
@jackye1995 jackye1995 self-requested a review February 6, 2026 05:02

/// Request for creating multiple namespaces with a single merge insert.
#[derive(Debug, Clone)]
pub struct CreateMultiNamespacesRequest {
Contributor

Why do we need this? I thought we only needed to create multiple tables in a namespace.

Contributor

Oh I see, never mind. It is because we want to create the namespaces that represent partition values.

Contributor

I think we should add these as part of the Lance Namespace operations and introduce BatchCreateNamespaces, BatchCreateTables, etc. Those operations can be useful even outside the context of the partitioned namespace.

And then you don't need a dedicated extension just for manifest namespace to make it work.
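
For illustration only, a generic batch request of the kind suggested here might look like the sketch below. `BatchCreateNamespacesRequest`, its `ids` field, `with_id`, and `expanded_ids` are hypothetical names, not existing Lance Namespace types, and the idea that a batch should also create missing parent namespaces is an assumption.

```rust
use std::collections::BTreeSet;

/// Hypothetical batch request: each entry is a full namespace id (path segments).
#[derive(Debug, Clone, Default)]
pub struct BatchCreateNamespacesRequest {
    pub ids: Vec<Vec<String>>,
}

impl BatchCreateNamespacesRequest {
    /// Add one namespace id, e.g. ["v1", "2026", "02"].
    pub fn with_id(mut self, id: &[&str]) -> Self {
        self.ids.push(id.iter().map(|s| s.to_string()).collect());
        self
    }

    /// Expand every id into itself plus all of its ancestors, deduplicated.
    /// Lexicographic ordering of the paths puts parents before their children.
    pub fn expanded_ids(&self) -> Vec<Vec<String>> {
        let mut seen = BTreeSet::new();
        for id in &self.ids {
            for depth in 1..=id.len() {
                seen.insert(id[..depth].to_vec());
            }
        }
        seen.into_iter().collect()
    }
}
```

Two partition namespaces that share a parent would then be created with the shared parent listed once, which is the property a single batched write needs.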

Contributor Author

> I think we should add these as a part of the Lance Namespace operations, introduce BatchCreateNamespaces, BatchCreateTables, etc.

Agree, we can do this.

}

#[derive(Debug, Default, Clone)]
pub struct CreateMultiNamespacesRequestBuilder {
Contributor

Starting a separate thread for this. I am beginning to wonder whether there is any real benefit in creating the sub-namespace structures. They seem to exist mainly because it is cool to list namespaces this way, but they serve no practical purpose, since all pruning is done directly against the table's partition column values in __manifest. Would it make more sense to not have those nested namespace structures at all?
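
A minimal sketch of what pruning against partition column values looks like, using an in-memory stand-in for `__manifest`. `ManifestRow`, its fields, and `prune` are hypothetical, and the `/`-separated object ids are only illustrative. The point is that the filter touches only the per-table partition values and never walks the nested namespaces.

```rust
use std::collections::HashMap;

/// Hypothetical in-memory stand-in for one table row in `__manifest`.
#[derive(Debug, Clone)]
pub struct ManifestRow {
    pub object_id: String,
    /// Partition column name -> value; `None` models a nullable column.
    pub partition_values: HashMap<String, Option<String>>,
}

impl ManifestRow {
    pub fn new(object_id: &str, values: &[(&str, Option<&str>)]) -> Self {
        Self {
            object_id: object_id.to_string(),
            partition_values: values
                .iter()
                .map(|(k, v)| (k.to_string(), v.map(|s| s.to_string())))
                .collect(),
        }
    }
}

/// Keep only the tables whose partition column equals the wanted value.
/// Rows where the column is missing or null are pruned away.
pub fn prune<'a>(rows: &'a [ManifestRow], column: &str, wanted: &str) -> Vec<&'a str> {
    rows.iter()
        .filter(|row| {
            matches!(row.partition_values.get(column), Some(Some(v)) if v.as_str() == wanted)
        })
        .map(|row| row.object_id.as_str())
        .collect()
}
```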

Contributor Author

Yes, having a table is sufficient for partition creation and pruning. The reason for retaining the namespaces is that PartitionedNamespace is a type of DirectoryNamespace that conforms to the partition spec standard. If we only keep the table part, PartitionedNamespace can no longer be treated as a normal DirectoryNamespace.

I think from a consistency perspective, it would be better to retain it. Shall we retain it, or remove it for simplicity?

@wojiaodoubao
Contributor Author

I just finished the first version of the partitioned namespace, and it is ready for review now. While implementing it, I found that the original design needed some updates. The reasons are explained below.

Explanation of the New Design

Compared to the previous partitioning spec, the current implementation introduces two key changes. Below I’ll explain the motivation behind these adjustments and would love to get further feedback and discussion.

1. Introducing TableSpec

In the earlier partition spec design, all tables shared the same schema, which was stored in the table metadata. A PartitionedNamespace contained multiple sub-namespaces named v{i}, representing different versions of the partition spec.

The problem with that design is that it makes schema evolution difficult. We cannot atomically modify the schema of all tables without breaking the semantic guarantee that “all tables share the same schema.”

Even if we support multi-partition transactions in the future, we would still be blocked at the step of updating the schema stored in table metadata, because updating table metadata itself does not have transactional semantics.

To solve this problem, I propose introducing a new abstraction: TableSpec, which encapsulates both the schema and the partition spec.

It is defined as:

pub struct TableSpec {
    id: i32,
    schema: ArrowSchema,
    partition_spec: PartitionSpec,
}

Each TableSpec is stored as a first-level sub-namespace under the PartitionedNamespace root. The sub-namespace is named v{i}, where i is the TableSpec id, and the schema and partition_spec are stored in the namespace properties.

Whenever the schema or the partition spec evolves, we create a new sub-namespace. This operation is purely metadata-level, and in the manifest namespace, creating a namespace is atomic.

New data must follow the new schema (i.e., tables are created and written under the new TableSpec namespace).

Existing data does not need to be modified and continues to use the previous schema and partition spec.
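
The evolution rule can be sketched with simplified stand-ins. Here the schema is reduced to `(name, type)` pairs and the partition spec to a list of column names, `SpecRegistry` is a hypothetical in-memory substitute for the first-level sub-namespaces, and starting ids at 1 is an assumption.

```rust
/// Simplified stand-in for the TableSpec above; the real field types are richer.
#[derive(Debug, Clone, PartialEq)]
pub struct TableSpec {
    pub id: i32,
    pub schema: Vec<(String, String)>, // (column name, type name)
    pub partition_spec: Vec<String>,   // partition column names
}

impl TableSpec {
    /// Namespace name under the root, following the `v{i}` convention.
    pub fn namespace_name(&self) -> String {
        format!("v{}", self.id)
    }
}

/// Hypothetical append-only registry of specs.
#[derive(Debug, Default)]
pub struct SpecRegistry {
    specs: Vec<TableSpec>,
}

impl SpecRegistry {
    /// Evolution never mutates an existing spec; it appends a new one.
    pub fn evolve(
        &mut self,
        schema: Vec<(String, String)>,
        partition_spec: Vec<String>,
    ) -> &TableSpec {
        let id = self.specs.len() as i32 + 1;
        self.specs.push(TableSpec { id, schema, partition_spec });
        self.specs.last().expect("just pushed")
    }

    /// New tables are created under the latest spec.
    pub fn current(&self) -> Option<&TableSpec> {
        self.specs.last()
    }

    /// Existing data keeps resolving against the spec it was written under.
    pub fn get(&self, id: i32) -> Option<&TableSpec> {
        self.specs.iter().find(|s| s.id == id)
    }
}
```

Because old specs are never rewritten, a reader holding `v1` still sees the schema that data was written with after `v2` is created.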

Another advantage of this design is that it makes it relatively straightforward to support branching functionality in the future.

The overall structure can be visualized as follows:
[image: diagram of the overall structure]

2. Using Deterministic Names for Namespaces

In the previous partition spec design, the name of a partition namespace was a 16-character base36 string. Any type of partition value would be mapped to such a 16-character base36 string.

The issue with this design is that it makes it difficult to resolve concurrency conflicts. In distributed scenarios, we may need to concurrently create tables and namespaces with the same partition values. Since partition values effectively serve as business primary keys, this becomes a consistency challenge.

Originally, I considered leveraging Lance’s merge-insert deduplication capability to enforce uniqueness at the business key level. However, merge-insert in Lance requires that:

  • A column be explicitly defined as the primary key
  • The primary key column be non-null

In our current design, the partition field columns in __manifest must be nullable. Additionally, partition fields can evolve over time, meaning we cannot reliably define a fixed initial primary key column. This makes it infeasible to use merge-insert deduplication directly on partition fields.

To address this, I propose using deterministic names for partition namespaces. At each level, the namespace name is derived from the serialized string representation of the partition value.
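
A minimal sketch of deterministic naming, assuming a small hypothetical `PartitionValue` enum. The exact serialization format, including the `__null__` marker, is an assumption for illustration and not the format used in this PR.

```rust
/// Hypothetical partition value type; the real set of supported types may differ.
#[derive(Debug, Clone, PartialEq)]
pub enum PartitionValue {
    Int(i64),
    Str(String),
    Null,
}

impl PartitionValue {
    /// The same value always yields the same namespace name.
    pub fn namespace_name(&self) -> String {
        match self {
            PartitionValue::Int(v) => v.to_string(),
            PartitionValue::Str(s) => s.clone(),
            PartitionValue::Null => "__null__".to_string(), // assumed null marker
        }
    }
}

/// Build the namespace path: the `v{id}` spec namespace first,
/// then one level per partition value.
pub fn partition_path(spec_id: i32, values: &[PartitionValue]) -> Vec<String> {
    let mut path = vec![format!("v{}", spec_id)];
    path.extend(values.iter().map(PartitionValue::namespace_name));
    path
}
```

Two writers that see the same partition values compute the same path independently, which is what lets a key on the resulting object id detect the duplicate. A real implementation would likely also need to escape characters that collide with the id delimiter.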

With this approach:

  • Namespace identity becomes deterministic and directly tied to partition values.
  • We only need to define a primary key on the object_id column.
  • We can then leverage merge-insert to resolve concurrency conflicts when creating new namespaces and tables.

This simplifies concurrency handling while preserving correctness in distributed environments.
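
The conflict-resolution idea can be sketched with an in-memory map keyed by `object_id`. This is a stand-in that only mimics insert-if-absent semantics, not Lance's merge-insert API; `Manifest`, `create_if_absent`, and `race` are hypothetical names.

```rust
use std::collections::HashMap;
use std::sync::{Arc, Mutex};
use std::thread;

/// In-memory stand-in for `__manifest`, keyed by `object_id`.
#[derive(Debug, Default)]
pub struct Manifest {
    rows: Mutex<HashMap<String, String>>, // object_id -> creator tag
}

impl Manifest {
    /// Insert-if-absent on the key; returns true if this call created the row.
    pub fn create_if_absent(&self, object_id: &str, creator: &str) -> bool {
        let mut rows = self.rows.lock().unwrap();
        if rows.contains_key(object_id) {
            return false;
        }
        rows.insert(object_id.to_string(), creator.to_string());
        true
    }

    pub fn len(&self) -> usize {
        self.rows.lock().unwrap().len()
    }
}

/// Several writers race to create the same deterministic object_id;
/// returns how many of them actually created it.
pub fn race(manifest: Arc<Manifest>, object_id: &str, writers: usize) -> usize {
    let handles: Vec<_> = (0..writers)
        .map(|i| {
            let m = Arc::clone(&manifest);
            let id = object_id.to_string();
            thread::spawn(move || m.create_if_absent(&id, &format!("writer-{}", i)))
        })
        .collect();
    handles
        .into_iter()
        .map(|h| h.join().unwrap())
        .filter(|created| *created)
        .count()
}
```

However many writers race on the same id, exactly one creates the row and the rest observe that it already exists.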

@wojiaodoubao wojiaodoubao force-pushed the partitioned-namespace-new branch from fadadad to 7940d6b Compare February 25, 2026 15:53
@codecov
codecov bot commented Feb 25, 2026

@wojiaodoubao wojiaodoubao marked this pull request as ready for review February 26, 2026 06:23